Fixation index

The fixation index (FST) is a measure of population differentiation (genetic distance) based on genetic polymorphism data, such as single-nucleotide polymorphisms (SNPs) or microsatellites. It is a special case of F-statistics, developed in the 1920s by Sewall Wright.

Definition

The fixation index, FST, measures the diversity of randomly chosen alleles within the same sub-population relative to that found in the entire population. It is often expressed as the proportion of genetic diversity due to allele frequency differences among populations.[1]

This comparison of genetic variability within and between populations is frequently used in population genetics. Values range from 0 to 1. A value of zero implies complete panmixis; that is, the populations are interbreeding freely. A value of one implies that the populations are completely separate.
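The two extremes can be illustrated with a minimal Python sketch. It assumes one common form of the index, F_ST = (H_T - H_S) / H_T, where H_S is the mean expected heterozygosity within sub-populations and H_T is that of the pooled population, applied to a single biallelic locus in two equally weighted sub-populations. The function name and the example frequencies are illustrative only.

```python
def fst_biallelic(p1, p2):
    """F_ST = (H_T - H_S) / H_T for one biallelic locus.

    p1, p2: frequency of the same allele in two equally weighted
    sub-populations (an assumption of this sketch).
    """
    # Mean expected heterozygosity within the sub-populations
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    # Expected heterozygosity of the pooled population
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)
    if h_t == 0:
        raise ValueError("locus is monomorphic overall; F_ST is undefined")
    return (h_t - h_s) / h_t


identical = fst_biallelic(0.3, 0.3)  # same allele frequencies: 0.0
fixed = fst_biallelic(1.0, 0.0)      # fixed for different alleles: 1.0
```

Identical allele frequencies give zero, and sub-populations fixed for different alleles give one.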

Several definitions of FST have been used, all measuring different but related quantities. A common definition is:[2]

 F_{ST} = \frac{ \Pi_\text{Between} - \Pi_\text{Within} } { \Pi_\text{Between} }

where  \Pi_\text{Between} and  \Pi_\text{Within} represent the average number of pairwise differences between two individuals sampled from different populations ( \Pi_\text{Between} ) or from the same population ( \Pi_\text{Within} ). The average pairwise difference within a population is the sum of the pairwise differences divided by the number of pairs. When using this definition of FST,  \Pi_\text{Within} should be computed for each population separately and then averaged. Otherwise, random sampling of pairs within populations puts most of the weight on the population with the largest sample size.
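The pairwise-difference definition above can be sketched in Python as follows. This is an illustration under stated assumptions rather than a reference implementation: sequences are equal-length strings, each population contains at least two sequences, exactly two populations are compared, and all function names and the toy data are hypothetical.

```python
from itertools import combinations, product


def pairwise_diff(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def mean_within(pop):
    """Sum of pairwise differences within one population / number of pairs."""
    pairs = list(combinations(pop, 2))
    return sum(pairwise_diff(a, b) for a, b in pairs) / len(pairs)


def mean_between(pop_a, pop_b):
    """Mean pairwise difference over all pairs with one sequence from each population."""
    pairs = list(product(pop_a, pop_b))
    return sum(pairwise_diff(a, b) for a, b in pairs) / len(pairs)


def fst_pairwise(pop_a, pop_b):
    # Pi_Within is computed per population and then averaged, so the
    # larger sample does not dominate the estimate.
    pi_within = (mean_within(pop_a) + mean_within(pop_b)) / 2
    pi_between = mean_between(pop_a, pop_b)
    return (pi_between - pi_within) / pi_between


# Toy data (hypothetical): three sequences in one population, two in the other
pop_a = ["AAAA", "AAAT", "AAAA"]
pop_b = ["TTAA", "TTAT"]
fst_example = fst_pairwise(pop_a, pop_b)
```

For this toy data the within-population means are 2/3 and 1, the between-population mean is 2.5, and the resulting FST is approximately 0.667.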

FST has been heavily criticized as a measure of differentiation, and it has been suggested that Jost's D is a better measure on both statistical and theoretical grounds.[3][4] However, the continued use of FST has been supported under certain circumstances.[5]
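A small Python sketch can make the criticism concrete. It assumes Nei's GST = (H_T - H_S) / H_T as the FST-type measure and the form D = [(H_T - H_S) / (1 - H_S)] [n / (n - 1)] for n sub-populations, as given by Jost (2008). It works on known allele frequencies in equally weighted sub-populations and ignores sampling corrections. The function name and the example data are hypothetical.

```python
def gst_and_d(pop_freqs):
    """Return (G_ST, D) for one locus.

    pop_freqs: list of dicts mapping allele -> frequency, one dict per
    equally weighted sub-population (at least two are required).
    """
    n = len(pop_freqs)
    alleles = set().union(*pop_freqs)
    # Mean within-sub-population expected heterozygosity
    h_s = sum(1 - sum(f * f for f in pop.values()) for pop in pop_freqs) / n
    # Expected heterozygosity from the mean allele frequencies
    mean_freqs = [sum(pop.get(a, 0.0) for pop in pop_freqs) / n for a in alleles]
    h_t = 1 - sum(f * f for f in mean_freqs)
    g_st = (h_t - h_s) / h_t
    d = (h_t - h_s) / (1 - h_s) * n / (n - 1)
    return g_st, d


# Two sub-populations with ten equally frequent alleles each and no shared alleles
pop_1 = {f"a{i}": 0.1 for i in range(10)}
pop_2 = {f"b{i}": 0.1 for i in range(10)}
g_st, d = gst_and_d([pop_1, pop_2])
```

In this example the two sub-populations share no alleles, yet GST is only about 0.05 because within-population diversity is high, whereas D is approximately 1.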

Differentiation can be assessed using online tools, some of which are listed below.

FST in humans

The International HapMap Project estimated FST for three human populations using SNP data. A more complex formula for FST was used in order to account for differences in sample size:

 F_{ST} = 1 - \frac{ \Pi_\text{Within} } { \Pi_\text{Between} } = 1 - \frac{ \Big [ \displaystyle\sum_{j} {n_j \choose 2} \displaystyle\sum_{i} 2 \frac{ n_{ij} } {n_{ij} - 1} x_{ij} (1 - x_{ij}) \Big ] / \displaystyle\sum_{j} {n_j \choose 2} } { \displaystyle\sum_{i} 2 \frac{ n_{i} } {n_{i} - 1} x_{i} (1 - x_{i}) }

In the above equation,  x_{ij} is the estimated frequency (proportion) of the minor allele at SNP  i in population  j ,  n_{ij} is the number of genotyped chromosomes at that position, and  n_j is the number of chromosomes analysed in that population. The lack of the  j subscript in the denominator indicates that the statistics  n_i and  x_i are calculated across the combined data sets.
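The estimator above can be transcribed into Python roughly as follows. This is a sketch under assumptions, not the HapMap project's own code: the input layout (a list of dictionaries with keys `n`, `counts` and `called`) is hypothetical, the combined statistics are obtained by pooling allele counts across populations, and every SNP is assumed to have at least two genotyped chromosomes in each population.

```python
from math import comb


def hapmap_fst(pops):
    """F_ST = 1 - Pi_Within / Pi_Between, following the formula above.

    pops: list of dicts, one per population j, with keys
      'n'      : number of chromosomes analysed (n_j)
      'counts' : per-SNP copies of the tracked allele
      'called' : per-SNP number of genotyped chromosomes (n_ij)
    """
    def het(n, x):
        # Sample-size-corrected heterozygosity term 2 * n/(n-1) * x * (1-x);
        # symmetric in x, so either allele can be tracked.
        return 2 * n / (n - 1) * x * (1 - x)

    n_snps = len(pops[0]["counts"])
    weights = [comb(p["n"], 2) for p in pops]  # number of chromosome pairs

    # Numerator: within-population term, weighted by the number of pairs
    weighted = 0.0
    for w, p in zip(weights, pops):
        weighted += w * sum(
            het(n_ij, c / n_ij) for c, n_ij in zip(p["counts"], p["called"])
        )
    pi_within = weighted / sum(weights)

    # Denominator: the same term computed on the combined data set
    pi_between = 0.0
    for i in range(n_snps):
        n_i = sum(p["called"][i] for p in pops)
        x_i = sum(p["counts"][i] for p in pops) / n_i
        pi_between += het(n_i, x_i)

    return 1 - pi_within / pi_between


# Hypothetical example: one SNP, two populations fixed for different alleles
pop_a = {"n": 10, "counts": [10], "called": [10]}
pop_b = {"n": 10, "counts": [0], "called": [10]}
fst_fixed = hapmap_fst([pop_a, pop_b])  # 1.0
```

Because of the n / (n - 1) sample-size corrections, this estimator can return slightly negative values when the populations have nearly identical allele frequencies.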

Across the autosomes, FST was estimated to be 0.12. The significance of this FST value in humans is contentious. Because an FST of zero indicates no divergence between populations, whereas an FST of one indicates complete isolation of populations, anthropologists often cite Lewontin's 1972 work, which arrived at a similar value and interpreted it as meaning that there is little biological difference between human races.[6] On the other hand, although an FST of 0.12 may be lower than that found between populations of many other species, Henry Harpending pointed out that this value implies that, on a world scale, "kinship between two individuals of the same human population is equivalent to kinship between grandparent and grandchild or between half siblings".[7]

Autosomal genetic distances based on SNPs

Intercontinental autosomal genetic distances based on SNPs[8]

Population                   Europe (CEU)  Sub-Saharan Africa (Yoruba)  East-Asia (Japanese)
Sub-Saharan Africa (Yoruba)  0.153
East-Asia (Japanese)         0.111         0.190
East-Asia (Chinese)          0.110         0.192                        0.007

Intra-European/Mediterranean autosomal genetic distances based on SNPs[9]

Population  Palestinians  Greeks  Italian  Spanish  Basque  Irish   German  Russian
Greeks      0.0057
Italian     0.0064        0.0001
Spanish     0.0101        0.0035  0.0010
Basque      0.0199        0.0098  0.0084   0.0060
Irish       0.0170        0.0067  0.0048   0.0037   0.0086
German      0.0136        0.0039  0.0029   0.0015   0.0079  0.0010
Russian     0.0202        0.0108  0.0088   0.0079   0.0126  0.0038  0.0037
Swedish     0.0191        0.0084  0.0064   0.0055   0.0100  0.0020  0.0007  0.0036

Programs for calculating FST

Modules for calculating FST

References

  1. ^ Holsinger, Kent E.; Bruce S. Weir (2009). "Genetics in geographically structured populations: defining, estimating and interpreting FST". Nat Rev Genet 10 (9): 639–650. doi:10.1038/nrg2611. ISSN 1471-0056. PMID 19687804. 
  2. ^ Hudson, RR.; Slatkin, M.; Maddison, WP. (Oct 1992). "Estimation of Levels of Gene Flow from DNA Sequence Data". Genetics 132 (2): 583–9. PMC 1205159. PMID 1427045. http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pmcentrez&artid=1205159. 
  3. ^ Jost, L. (2008). "G(ST) and its relatives do not measure differentiation". Molecular Ecology 17. 
  4. ^ Jost, L. (2009). "D vs. G(ST): Response to Heller and Siegismund (2009) and Ryman and Leimar (2009)". Molecular Ecology 18. 
  5. ^ Ryman, N.; Leimar, O. (2009). "G(ST) is still a useful measure of genetic differentiation - a comment on Jost's D". Molecular Ecology 18. 
  6. ^ Lewontin, R. C. (1972). "The apportionment of human diversity". Evolutionary biology 6 (38): 381–398. 
  7. ^ Harpending, Henry (2002-11-01). "Kinship and Population Subdivision". Population & Environment 24 (2): 141–147. doi:10.1023/A:1020815420693. 
  8. ^ Nelis, Mari; et al. (2009-05-08). Fleischer, Robert C., ed. "Genetic Structure of Europeans: A View from the North–East". PLoS ONE 4 (5): e5472. Bibcode 2009PLoSO...4.5472N. doi:10.1371/journal.pone.0005472. PMC 2675054. PMID 19424496. http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pmcentrez&artid=2675054. See table.
  9. ^ Tian, Chao; et al. (2009-11). "European Population Genetic Substructure: Further Definition of Ancestry Informative Markers for Distinguishing among Diverse European Ethnic Groups". Molecular Medicine 15 (11–12): 371–383. doi:10.2119/molmed.2009.00094. ISSN 1076-1551. PMC 2730349. PMID 19707526. http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pmcentrez&artid=2730349. See table.
  10. ^ Crawford, N. G. (2010). "smogd: software for the measurement of genetic diversity". Molecular Ecology Resources 10. 
